Automatically Extracting Typical Syntactic Differences from Corpora
Similar resources
Automatically Extracting Typical Syntactic Differences from Corpora
We develop an aggregate measure of syntactic difference for automatically finding common syntactic differences between collections of text. With this measure it is possible to mine for differences between, for example, the English of learners and natives, or between related dialects. If formulated in advance, hypotheses can also be tested for statistical significance. It enables us to...
Extracting semantic relations from Portuguese corpora using lexical-syntactic patterns
The growing investment in automatic extraction procedures, together with the need for extensive resources, makes semi-automatic construction a new viable and efficient strategy for developing language resources, combining accuracy, size, coverage and applicability. These assumptions motivated the work depicted in this paper, aiming at the establishment and use of lexical-syntactic patterns f...
Automatically enriching spoken corpora with syntactic information for linguistic studies
Syntactic parsing of speech transcriptions faces the problem of disfluencies that break the syntactic structure of the utterances. We propose in this paper two solutions to this problem. The first one relies on a disfluency predictor that detects disfluencies and removes them prior to parsing. The second one integrates the disfluencies in the syntactic structure of the utteran...
Detecting Syntactic Substratum Effects Automatically in Interlanguage Corpora
This paper applies techniques to obtain an aggregate measure of syntactic distance between two varieties of English spoken by first- and second-generation Finnish Australians and examines the degree of what we call syntactic ‘contamination’ in the two. Our general goal is to detect the linguistic sources of the variation between the two groups and interpret the findings from (at least) two perspe...
Automatically Inducing Ontologies From Corpora
The emergence of vast quantities of on-line information has raised the importance of methods for automatic cataloguing of information in a variety of domains, including electronic commerce and bioinformatics. Ontologies can play a critical role in such cataloguing. In this paper, we describe a system that automatically induces an ontology from any large on-line text collection in a specific dom...
Journal
Journal title: Literary and Linguistic Computing
Year: 2010
ISSN: 0268-1145, 1477-4615
DOI: 10.1093/llc/fqq017